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Abstract 

We present predictions for the statistical error due to finite sampling in the presence of thermal 
fluctuations in molecular simulation algorithms. Specifically, we establish how these errors de- 
pend on Mach number, Knudscn number, number of particles, etc. Expressions for the common 
hydrodynamic variables of interest such as flow velocity, temperature, density, pressure, shear 
stress and heat flux are derived using equilibrium statistical mechanics. Both volume-averaged 
and surface-averaged quantities are considered. Comparisons between theory and computations 
using direct simulation Monte Carlo for dilute gases, and molecular dynamics for dense fluids, 
show that the use of equilibrium theory provides accurate results. 

Keywords: Statistical error, sampling, fluctuations, Monte Carlo, hydrodynamics 

1 Introduction 

Recently much attention has been focused on the simulation of hydrodynamic problems at small scales 
using molecular simulation methods such as Molecular Dynamics (MD) HJ, ^] or the direct simulation 
Monte Carlo (DSMC) g §. Molecular Dynamics is generally used to simulate liquids while DSMC is 
a very efficient algorithm for simulating dilute gases. In molecular simulation methods the connection 
to macroscopic observable fields, such as velocity and temperature, is achieved through averaging 
appropriate microscopic properties. The simulation results are therefore inherently statistical and 
statistical errors due to finite sampling need to be fully quantified. 
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Though confidence intervals may be estimated by measuring the variance of these sampled quan- 
tities, this additional computation can be burdensome and thus is often omitted. Furthermore, it 
would be useful to estimate confidence intervals a priori so that one could predict the computational 
effort required to achieve a desired level of accuracy. For example, it is well known that obtaining 
accurate hydrodynamic fields (e.g., velocity profiles) is computationally expensive in low Mach num- 
ber flows so it is useful to have an estimate of the computational effort required to reach the desired 
level of accuracy. 

In this paper we present expressions for the magnitude of statistical errors due to thermal fluc- 
tuations in molecular simulations for the typical observables of interest, such as velocity, density, 
temperature, and pressure. We also derive expressions for the shear stress and heat flux in the dilute 
gas limit. Both volume averaging and flux averaging is considered, even though the measurement 
of the shear stress and heat flux through volume averaging is not exact unless the contribution of 
the impulsive collisions is accounted for. Although we make use of expressions from equilibrium 
statistical mechanics, the non-equilibrium modifications to these results are very small, even under 
extreme conditions ||. This is verified by the good agreement between our theoretical expressions 
and the corresponding measurements in our simulations. 

In addition to direct measurements of hydrodynamic fields, this analysis will benefit algorithmic 
applications by providing a framework were statistical fluctuations can be correctly accounted for. 
One example of such application is the measurement of temperature for temperature-dependent 
collision rates fTq| ; another example is the measurement of velocity and temperature for the purpose 
of imposing boundary conditions [|20|| . Additional examples include hybrid methods [jH], [l4|] where 
the coupling between continuum and molecular fields requires both averaging of finite numbers of 
particles in a sequence of molecular realizations as well as the generation of molecular realizations, 
based on continuum fields, with relatively small numbers of particles (e.g., buffer cells). 

In section 2 the theoretical expressions for the statistical error due to thermodynamic fluctuations 
are derived. These expressions are verified by molecular simulations, as described in section 3. The 
effect of correlations between samples in dilute gases is briefly discussed in section 4 and concluding 
remarks appear in section 5. 



2 Statistical error due to thermal fluctuations 

2.1 Volume-averaged quantities 

We first consider the fluid velocity. In a particle simulation, the flow field is obtained by measuring 
the instantaneous center of mass velocity, u, for particles in a statistical cell volume. The statistical 
mean value of the local fluid velocity, (u) s , is estimated over M independent samples. For steady 
flows, these may be sequential samples taken in time; for transient flows these may be samples from 
an ensemble of realizations. The average fluid velocity, (u), is defined such that (u) s —* (u) as 
M — > oo; for notational convenience we also write (u) = uo- Define 5u x = u x — u x q to be the 
instantaneous fluctuation in the x-component of the fluid velocity; note that all three components 
are equivalent. From equilibrium statistical mechanics |J, 

where No is the average number of particles in the statistical cell, Tq is the average temperature, m 
is the particle mass, k is Boltzmann's constant, a is the sound speed, and 7 = cp/cy is the ratio 
of the specific heats. The acoustic number Ac = a /a 1 is the ratio of the fluid's sound speed to the 
sound speed of a "reference" ideal gas at the same temperature 
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Note that this reference ideal gas has a ratio of specific heats (7*) equal to the original fluid specific 
heat ratio, that is 7* = 7 as shown in equation (|2|). 

An alternative construction of (|l]) is obtained from the equipartition theorem Q 

f/cTo = \m{\c - c | 2 ) = \m{{(c x - c x0 ) 2 + (c y - c yQ ) 2 + (c z - c z0 ) 2 )) = %m{(c x - c x0 ) 2 ) (3) 

where c is the translational molecular velocity. For a non-equilibrium system, this expression defines 
To as the average translational temperature. Note that uq = Cq and 

(\5u\ 2 ) = <' C - C °' 2 > = (4) 

The above expression also reminds us that the expected error in estimating the magnitude of the 
fluid velocity is v3 larger than in estimating a velocity component. 

We may define a "signal-to-noise" ratio as the average fluid velocity over its standard deviation; 
from the above, 

-^f\=AcM a VWo (5) 

where Ma = |«a;o|/ a is the local Mach number based on the velocity component of interest. This 
result shows that for fixed Mach number, in a dilute gas simulation (Ac = 1), the statistical error 
due to thermal fluctuations cannot be ameliorated by reducing the temperature. However, when the 
Mach number is small enough for compressibility effects to be negligible, favorable relative statistical 
errors may be obtained by performing simulations at an increased Mach number (to a level where 
compressibility effects are still negligible). 

The one-standard-deviation error bar for the sample estimate (u x ) s is o u = \J{bu x )j\[M and the 
fractional error in the estimate of the fluid velocity is 

" Ko| ^MNo~ AcMa^' 1 ' 

yielding 

M = ^- ^— . (7) 

7 Ac 2 Af Ma 2 £;2 v ; 

For example, with Nq = 100 particles in a statistical cell, if a one percent fractional error is desired 
in a Ma = 1 flow, about M = 100 independent statistical samples are required (assuming Ac ~ 1). 
However, for a Ma = 10~ 2 flow, about 10 6 independent samples are needed, which quantifies the 
empirical observation that the resolution of the flow velocity is computationally expensive for low 
Mach number flows. 

Next we turn our attention to the density. From equilibrium statistical mechanics, the fluctuation 
in the number of particles in a cell is 

i^-^(fp) T ^rNff (8, 

where V is the volume of the statistics cell and kt = —V~ 1 (dV/dP)T is the isothermal compressibil- 
ity. Note that for a dilute gas kt = 1/P so {5N 2 ) = N and, in fact, N is Poisson random variable. 
The fractional error in the estimate of the density is 



'' Po No n Vm Vmv VmNq~ 
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where n l T = V/NqUTq is the isothermal compressibility of the reference dilute gas (7* = 7) at the 
same density and temperature. Since a oc 



E = , -— (10) 
p y/MNo Ac v ' 

Note that for fixed M and iVo, the error decreases as the compressibility decreases (i.e., as the sound 
speed increases) since the density fluctuations are smaller. 

Let us now consider the measurement of temperature. First we should remark that the measure- 
ment of instantaneous temperature is subtle, even in a dilute gas. But given that temperature is 
measured correctly, equilibrium statistical mechanics gives the variance in the temperature fluctua- 
tions to be 

<*T 2 > = (11) 

where cy is the heat capacity per particle at constant volume. The fractional error in the estimate 
of the temperature is 

Et _ ££ _ v2p _ 1 [X (12) 

Because the fluctuations are smaller, the error in the temperature is smaller when the heat capacity 
is large. Note that the temperature associated with various degrees of freedom (translational, vibra- 
tional, rotational) may be separately defined and measured. For example, if we consider only the 
measurement of the translational temperature, then the appropriate heat capacity is that of an ideal 
gas with three degrees of freedom, i.e. cy = corresponding to the three translational components. 
Finally, the variance in the pressure fluctuations is 

so the fractional error in the estimate of the pressure is 

where Pq = N^kT^/V is the pressure of an ideal gas under the same conditions. Note that the error 
in the pressure is proportional to the acoustic number while the error in the density, eqn. (|l0|), goes 
as Ac -1 . 



2.2 Shear stress and heat flux for dilute gases 

The thermodynamic results in the previous section are general; in this section we consider transport 
quantities and restrict our analysis to dilute gases. In a dilute gas, the shear stress and heat flux are 
defined as 

TxyO = {r x y) = (pC x Cy) (15) 

and | 

QxO = (qx) = (^PCxC 2 ) (16) 

respectively. This definition neglects the contribution of impulsive collisions to the transport within 
a given volume due to the negligible size of particles in the dilute limit. When, however, simulation 
methods such as DSMC are used to simulate a dilute gas that use a finite molecular size, a small 
error arises when momentum and energy transport is calculated in a volume averaged manner. In 
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fact, this inconsistency has been addressed at the equation of state level by Alexander et al. [15] who 
pointed out that DSMC reproduces an ideal gas equation of state while simulating transport of a 
hard-sphere gas with molecules of finite size. Thus, under the assumption of a very dilute gas, the 
fluctuation in the (equilibrium) shear stress and heat flux in a volume V containing Nq particles can 
be calculated using the Maxwell-Boltzmann distribution. In equilibrium, the expected values of the 
above quantities are zero. 

Using the definitions of shear stress and heat flux in terms of moments of the velocity distribution, 
direct calculation of the variance of the x-y component of the stress tensor based on a single-particle 
distribution function gives 



(r 2 ) 

\ ' xyl 



{(pCxCy) 2 ) 

1 2 {2kT 

4 Po l~^r 



In obtaining the second equation we assumed (c x ) 
component of the heat flux vector, we find 



p2 

1 o 



(c y ) = ((P ~ Po)c x c y ) 



(Ql) 



{{\pc,c 2 ) 2 ) 

Pi <(^c 2 ) 2 ) 

35 2 ( 2kT 
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(17) 
(18) 

(19) 

0. For the x 

(20) 
(21) 

(22) 



where c m = ^/2/cTo/m the most probable particle speed and we have assumed (c x ) = (c y ) = ((p — 
Po)c 2 ) = 0. Note that in equilibrium for a cell containing iVo particles, the variance of the mean is 
given by (Sr^ y ) = (r 2 y ) /Nq and (Sq 2 ) = (q 2 ) /Nq for the shear stress and heat flux respectively. 

In order to derive expressions for the relative fluctuations we need expressions for the magnitude 
of the fluxes. We are only able to provide closed form expressions for the latter in the continuum 
regime where 



ixy 



du T du, 



and 



dy dx 



dT 



dx 



(23) 



(24) 



where p is the coefficient of viscosity and k is the thermal conductivity. Above Knudsen numbers 
of Kn ~ 0.1, it is known that these continuum expressions are only approximate and better results 
are obtained using more complicated formulations from kinetic theory (e.g., Burnett's formulation). 
Here, the Knudsen number is defined as Kn = X/£ and the mean free path as 



A 



C m /i 
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(25) 



Note that this expression for the mean free path simplifies to the hard sphere result when the viscosity 
is taken to be that of hard spheres. 

Using (23) and ( ]24| ) we find that in continuum flows, the relative fluctuations in the shear stress 
and heat flux are given by 
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and 



E„ 



8^35 Pr( 7 - 1) T 1 1 
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AT Kn^ApJ 



dT* 



dx* 



(27) 



respectively. Here, stars denote non-dimensional quantities: u* = uq/u, T* = T/ AT and x* = x/£, 
where u, AT, and I are characteristic velocity, temperature variation and length. The Mach number 
Ma* is defined with respect to the characteristic velocity u rather than the local velocity as in eq. 



If viscous heat generation is responsible for the temperature differences characterized by AT, 
then it is possible to express equation (|27|) in the following form 



E n 



vim 



iV35 Br 



I QxO I 



M -V KuMa- v' AnT/ 



dT* 



dx* 



The Brinkman number 



Br 



(28) 



(29) 



kAT' 

is the relevant non-dimensional group that compares temperature differences due to viscous heat 
generation to the characteristic temperature differences in the flow. (It follows that if viscous heat 
generation is responsible for the temperature changes, Br ~ 1.) 

It is very instructive to extend the above analysis to equation (|l^). If we define the relative error 
in temperature with respect to the temperature changes rather than the absolute temperature, we 
obtain 



-Eat 




(30) 
(31) 



where again, if viscous heat generation is the only source of heat, Br ~ 1. The above development 
shows that resolving the temperature differences or heat flux due to viscous heat generation is very 
computationally inefficient for low speed flow since for a given expected error -Eat we find that the 
number of samples M oc Ma~ 4 . 

Comparison of equations (||) and ( |2"6| ) and equations (12) and (^7j) reveals that 



^~Kn" 



and 



-Eat 
Kn 



(32) 



(33) 



since the non-dimensional gradients will be of order one. As the above equations were derived for 
the continuum regime (Kn < 0.1), it follows that the relative error in these moments is significantly 
higher. This will also be shown to be the case in the next section when the shear stress and heat 
flux are evaluated as fluxal (surface) quantities. This has important consequences in hybrid methods 
|l4| , O ], as coupling in terms of state (Dirichlet) conditions is subject to less variation than coupling 
in terms of flux conditions. 
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2.3 Fluxal quantities 

In this section we predict the relative errors in the fluxes of mass and heat as well as the components 
of the stress tensor, when calculated as fluxes across a reference surface. Our analysis is based on 
the assumption of an infinite, ideal gas in equilibrium. In this case, it is well known that the number 
of particles, iVj, in each infmitessimal cell i is an independent Poisson random variable with mean 
and variance, 

(Ni) = (SNf) = nAxiAyiAzi, (34) 

where n = p/m is the particle number density. 

In the Appendix, it is shown that, more generally, all number fluctuations are independent and 
Poisson distributed. In particular, the number of particles, N^, leaving a cell % at position (xj, yi, Zi), 
crossing a surface, and arriving in another cell j at position (xj,yj, Zj) after a time At (see Figure [j]) 
is also an independent Poisson random variable. Its mean and variance are given by 

(N+) = ((8N+) 2 )=P ij (N i ) (35) 

where 

Pij = P(xij,yij, Zij)AxjAyjAzj, (36) 

is the probability that a particle makes the transition, which depends only upon the relative dis- 
placements, Xij = Xj — Xi, yij = yj — yi, and Zy = Zj — Z\. These properties hold exactly for an 
infinite ideal gas in equilibrium, but they are also excellent approximations for most finite, dilute 
gases, even far from equilibrium, e.g. as shown by the DSMC result in Figure 

Without loss of generality we place a large planar surface of area A at x = 0. The flux in the 
positive direction, J^~, of any velocity-dependent quantity, h(c x , c y , c z ), can be expressed as a sum of 
independent, random contributions from different cells i and j on opposite sides of the surface, 

Xi<0xj>0 Hi Vj Zi Zj 

where h{xij , , Zij) = h(c x ,c y ,c z ) is the same quantity expressed in terms of relative displacements 
during the time interval, At. We begin by calculating the mean flux. Taking expectations in Eq. (|37|), 
we obtain 

(4) = ^ t T.T.T.T.T.T.H^^,z tJ ){N t )p t3 (as) 

Xi<0xj>® Vi yj Zi Zj 

~ a7 ^^Hxij^j^^nAxiPixij^j^^AxjAyjAzj, (39) 

Xi<0xj>0 Vj Zj 

where without loss of generality we have taken y^ = z% = and also separately performed the sums 
parallel to the surface, 

j2Y A y* Az i = A i 

Vi Zi 



by invoking translational invariance and neglecting any edge effects, since (c y ) At, (c z ) At = o(vi) 
in the limit At — > (taken below). 

Passing to the continuum limit in Eq. (^) and letting A — > oo, we arrive at an integral expression 
for the mean flux through an infinite flat surface, 



^ r0 roo roo />oo _ 

( J h) = / dx ' dx d y dz h(x - x',y,z)P(x - x',y,z) 

-oo JO J —oo J —oo 



At 

n 

At 



oo roo roo roo 







dx ds dy dz h(s,y, z)P(s,y, z). (40) 
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Since the integrand does not depend on x, switching the order of integration 

n 

At 



( J h) = T7 / ds I d y I dz h ( s >y> z ) p ( s >y> z ) I dx 



oo J — oo 



yoo poo roc _ g 

= n ds dy dz h(s,y, z)P(s,y, z)— (41) 

JO J-oo J-oo £±t 

produces a simple formula for the mean flux as a conditional average over the velocity distribution 
(with c x > 0), 

(J+) = n (c x h(c x ,c y , c z ))+ (42) 
in the limit At — > 0. For h(c x ,c y ,c z ) = m,mc x ,mcy,mc z , \m(? in the absence of a mean flow, we 



obtain, 



<J+) = n (c x m)+ = ff= (43) 

(J+) = n(c x mc x )+ = ^ = ^ (44) 
(J+) = n(c x mc y ) + = (J+.) = n(c x mc z ) + = (45) 



„3 



<J+) = n ( C:c im C V = ^ (46) 

which are the well-known results for the expected one-sided fluxes of mass, momentum, and energy, 
respectively. 

Using the same formalism as above, we now derive a general formula for the variance of the 
flux, ((SJfr) 2 }- For simplicity, we assume that the mean flow is zero normal to the reference surface, 
but our results below are easily extended to the case of a non-zero mean flow through the surface 
since Eq. (j35[) still holds in that case (see Figure ||). Taking the variance of Eq. (|37| ) and using the 
independence of {N^}, we obtain 

Xi<0xj>0 Hi Vj Zi Zj 

= -Afi E E EEEE H x u>vi3> %) 2 (47) 

XiKOx^O Vi Vj Zi Zj 

Following the same steps above leading from Eq. (Bq) to Eq. (|42|) we arrive at the simple formula, 



((^) 2 ) = ^ t (cxh(c x ,c y ,c z ) 2 ) + (48) 



in the continuum limit as At — > 0. The variance of the total flux is 

r2\ _ hx t+ i x t-\2\ _ // x T+^2\ i i/x t-\2 



m) = ( w + M£T> = (Wn + (to = 2< to (49) 

h and J /T 



since the one-sided fluxes through the surface, and J, , are independent and identically distributed 



(with opposite sign). 

Using this general result, we can evaluate the standard deviations of the fluxes above, 



<"™> = t/wVas (50) 



r3 
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These formulae may be simplified by noting that 



m 



AAt 2/tt 



(53) 



where N + is the mean total number of particles crossing the reference surface in one direction in 
time At,, which yields 



(SJl 



1 



(SJi 




xyl 




(54) 
(55) 
(56) 

./■/// v'''.s-!ji <wid 

(5(ql) 2 } = (SJ 2 ), and additionally include the effect of M (independent) samples in time. The 
superscript / denotes fluxal measurement. By noting that transport fluxes are defined with respect 
to the rest frame of the fluid, it can be easily verified that the above relations hold in the case where 
a mean flow in directions parallel to the measuring surface exists, under the assumption of a local 
equilibrium distribution. 

We can derive expressions for the relative expected error in the continuum regime in which models 
exist for the shear stress and heat flux. In this regime we find 



We may relate these to the results from the previous section by identifying (S(tJL) 2 
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5-Ky/2r/ Kn Ma* y/MN^ 



du* dv* 
dy* dx* 



(57) 



and 
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577T Kn Ma* y/MN- 



dT 



dx* 



-i 



(58) 



(59) 



Comparing ( |57| ) with the corresponding expressions for volume- averaged stress tensor, (^), one finds 
that, aside from the numerical coefficients, the expressions differ only in the number of particles used, 
either N + or Nq; one finds a similar result for the heat flux. 



2.4 Connection to Fluctuating Hydrodynamics 

Fluctuating hydrodynamics, as developed by Landau, approximates the stress tensor and heat flux 
as white noises, with variances fixed by matching equilibrium fluctuations. In this section we identify 



the connection between Landau's theory and the variances of fluxes obtained in section 2.2. 

Landau introduced fluctuations into the hydrodynamic equations by adding white noise terms to 
the stress tensor and heat flux jTTfl (in the spirit of Langevin's theory of Brownian motion [|l9|). The 
amplitudes of these noises are fixed by evaluating the resulting variances of velocity and temperature 
and matching with the results from equilibrium statistical mechanics fl8[| . For example, in Landau's 
formulation the total heat flux in the x-direction is = —ndT/dx + g x where the first term on the 
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r.h.s. is the deterministic part of the flux and the second is the white noise term. The latter has 
mean zero and time correlation given by the following expression 

9ki<T 2 

{g x {t)g x (t')) = ^-S(t-t') (60) 

Note that at a steady state the deterministic part is constant so (5q x (t)5q x (t')) = (g x (t)g x (t')). 
On the other hand, recall from eqn. (p^), 

The question naturally arises: how does one reconcile ( |60| ) and (|6l|)? Note that the fluctuating 
hydrodynamics expression contains the thermal conductivity, which depends on the particle interac- 
tion (e.g., for hard-spheres k depends on the particle diameter) while the kinetic theory expression 
is independent of this interaction. 

The key lies in identifying the <5-function with a decay time, tj, that is vanishingly small at 
hydrodynamic scales. Specifically, we may write 

' " " otherwise ' ' 



so 



Comparing with the above gives 



t d V 2kN cf a t d N 

16 kV . A . 

td= 35kN^ (64) 



For a hard sphere gas, k = |cy/i so using fl2q) we may write this as, 

t d = ^- (65) 
28 c m v ; 

For other particle interactions the coefficients will be slightly different but in general ~ A/c m , 
thus it is approximately equal to the molecular collision time. In conclusion, the two formulations 
are compatible once the white noise approximation in fluctuating hydrodynamics is justified by the 
fact that the hydrodynamic time scale is much longer than the kinetic (i.e., collisional) time scale. 
Landau's construction provides a useful hydrodynamic approximation for g x but ([H|) is the actual 
variance of the heat flux. 



3 Simulations 
3.1 Dilute Gases 

We performed DSMC simulations to verify the validity of the expressions derived above. Standard 
DSMC techniques ||, || were used to simulate flow of gaseous argon (molecular mass m = 6.63 x 
10 -26 kg, hard sphere diameter a = 3.66 x 10~ 10 m) in a two-dimensional channel (length L and 
height H). The simulation was periodic in the x direction (along the channel axis). The two walls 
at y = —H/2 and y = H/2 were fully accommodating and flat. The simulation was also periodic in 
the third (homogeneous) direction. 

The average gas density was po = 1.78kg/m 3 and in all calculations over 40 particles per cell 
were used. The cell size was Ax = Aq/3 where Aq is the reference mean free path. The time step 
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was At = Ao/(7.5c m ). For a discussion of the errors resulting from finite cell sizes and time steps 
see [||, |9[ . The fractional error in the simulations is obtained from the standard deviation of cell 
values in the x and z directions. To ensure that the samples were independent, samples were taken 
only once every 250 time steps. To ensure that the system was in its steady state the simulation was 
run for 10 6 time steps before sampling was started. 

A constant acceleration was applied to the particles to produce Poiseuille flow in the x direction 
with maximum velocity at the centerline u™^ ~ 2 m/s. Figures [|, |f| and || show good agreement 
between the theoretical expressions from section 2 and simulation measurements for the fractional 
error in velocity, density and temperature, respectively. The fractional error in the velocity measure- 
ment is minimum at the centerline since the Poiseuille velocity profile is parabolic and maximum at 
the centerline, (see Fig. |]). The density and temperature were nearly constant across the system so 
the fractional errors in these quantities are also nearly constant. 

The expressions for shear stress and heat flux were verified using Couette (walls at equal tem- 
perature with different velocities) and "temperature" Couette (walls at zero velocity with different 
temperatures) calculations respectively. In these calculations, very small cell sizes (Ax = A/6) and 
time steps (At = A/(30c m )) were used in order to minimize the discrepancy between the volume- 
averaged and surface-averaged shear stress and heat flux ||]. The system was equilibrated for 10 6 
time steps and samples were taken every 50 time steps. The momentum and energy fluxes de- 
correlate faster than the conserved hydrodynamic variables, such as density, so independent samples 
are obtained after fewer time steps (see section 4). Good agreement is found between the theoretical 
results and simulation measurements for volume averaged and fluxal quantities, as shown in figures 

0, 1, 1 M 

A final note: In DSMC simulations one considers each particle as "representing" a large number 
of molecules in the physical system. In all the expressions given above, No and iV + relates to the 
number of particles used by the simulation so the fluctuations can be reduced by using larger numbers 
of particles (i.e., using a lower molecule-to-particle ratio). 



3.2 Dense fluids 

We performed molecular dynamics simulations to test the validity of equations (||), (|i~0|), (|i~2] ) for dense 
fluids. A similar geometry to the dilute gas simulations described above was used but at a significantly 
higher density. In particular, we simulated liquid argon (cjlj = 3.4 x 10 _10 m, elj = 119.8fc^) at 
T = 240-fT and p = 860Kg/m 3 in a two-dimensional channel with the x and z directions periodic. 
The channel height was H = 69.7<T£j. The wall molecules were connected to fee lattice sites through 
springs and interacted with the fluid through a Lennard- Jones potential with the same parameters. 
The spring constant k s = 460ea~ 2 was chosen in the way that root mean square displacement of wall 
atoms around their equilibrium position at the simulated temperature was well below the Lindermann 
criterion for the melting point of a solid. The length and depth of the system was 28au and 29.1<tlj 
in the x and z directions respectively. A constant force / = 8 x 10 -5 e/<7£j per particle was used to 
generate a velocity field with a maximum velocity of approximately 13 m/s. 

In order to calculate the fluctuation of density, temperature and velocity, we divided the simu- 
lation cell into 13 layers in the y— direction with a height Ay = 4.6396c. We further divided each 
layer into 49 cells, 7 in each of the x and z directions. The density, temperature and velocity in 
each cell were calculated in every 2000(0. 005tij), where thj = \JfnL.j^\j/^Lj- We have checked 
that this time interval is longer than the system's correlation time such that samples taken between 
such intervals are independent. For each cell, 200 samples are used to calculate the average density, 
temperature and velocity. The fluctuation was calculated for each layer using the 49 equivalent cells 
in the x — z plane. 

Due to the sensitivity of the compressibility kt on the interaction cutoff r c , a rather conservative 
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value of r c = A.Octlj was used. We also introduced a correction for the still-finite cutoff which used 
the compressibility predictions of the Modified Benedict- Webb- Rubin equation of state [13]. The 
agreement between the theoretical predictions and the simulations is good (see Figures 11, 12 and 

m. 



4 Independent Samples and Correlations 

The results in figures ^, || ||, [R] suggest that volume-averaged measurements provide a superior 
performance due to a smaller relative error. This conclusion, however, is not necessarily correct 
because our results are based on arbitrary choices of "measurement spacing", in the sense that the 
only consideration was to eliminate correlations in the data since the theoretical formulation in section 
[2| is based on the assumption of uncorrelated samples. The two methods of sampling are in fact linked 
by a very interesting interplay between the roles of time and space: fluxal sampling is a measurement 
at a fixed position in space for a period of time, whereas volume sampling is performed over some 
region of space at a fixed time. The theoretical performance of each method can be increased by 
extending the respective window of observation. However, by increasing the time of observation, a 
fluxal measurement becomes correlated with neighboring measurements if the same particle crosses 
more than one measuring station within the period of observation. Similarly, by increasing the region 
of measurement, subsequent volume measurements will suffer from time correlations if previously 
interrogated particles do not have sufficient time to leave the measurement volume. This relation 
between spatial and temporal sampling and the role of the particle characteristic velocity is also 
manifested in the theoretical predictions. Enforcing equality of the variances of the respective volume 



and fluxal measurements (eqs (55) and (26), and eqs (|5q) and (|27|)) yields Ax = /3c m Ai where (3 » 1. 
The generalization of this work to include time and spatial correlations is the subject of future work. 

The effect of time correlation in volume measurements can be approximated using the theory 
of the "persistent random walk" pi], first introduced by Fiirth |22| and Taylor Q. A persistent 
random walk is one in which each step displacement, u^) = u(rjAt), is identically distributed and 
has a positive correlation coefficient with the previous step, 

{ou z } 

(0 < a < 1), e.g. to model diffusion in a turbulent fluid ]/], [23||. This assumption implies that step 
correlations decay exponentially in time, 

(Sum) mo)) _ v _ e -t v/tc (67) 

(5u 2 ) ' 1 ' 

where t c = —At/ log a is the correlation time, beyond which the steps are essentially indepedent. 
The position, U(t„), of the random walker after r\ steps is the sum of these correlated random 
displacements, U(t v ) = J2]=i u (tj)- Following Taylor, it is straightforward to show that for times 
long compared to the correlation time (t„ 3> t c ), the usual diffusive scaling holds 

(5U(t v ) 2 ) ~ (5u 2 ) r] (i±fi) , (68) 

where the bare diffusion coefficient is modified by the term in parentheses due to correlations. 

There is a natural connection with the measurement of statistical averages. If we view each step 
in the persistent random walk as a correlated sample of some quantity in the gas, then the position of 
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the walker (divided by the number of samples) corresponds to the sample average. Thus the variance 
of a set of sequentially correlated random variables {u(t v )}, Var c (u), may be written as 

1 + a 

Var c (u) = Var(u)—^- (69) 
1 — a 

where Var(u) is the variance of the uncorrelated samples (a = 0). The theory above implies that the 
sample variance is amplified by the presence of correlations, because effectively fewer independent 
samples have been taken compared to the uncorrelated case, a = 0. 

Note that a sequence of correlated random variables, {u(t v )}, satisfying Eqs. ( |66| ) and ( |67|) can be 
explicitly constructed from a sequence of independent, identically distributed variables, {u(t v )}, by 
letting uitrj) equal the previous value u(t v -i) with probability a or a new value u(t v ) with probability 
1 — a. This allows us to interpret a as the probability that a sample is the same as the previous one, 
which is precisely the source of correlations when volume-sampling dilute gases. 

There are two distinct ways that a new sample of a dilute gas can actually provide new informa- 
tion: (i) Either some new particles have entered the sampling cell, or (ii) some particles previously 
inside the cell have changed their properties due to collisions. Regarding (i), the probability that a 
particle will remain in a cell of size Ax after time step At is given by 



I rAx/2 rAx/2 - x 
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v / / f(^)dxdx) (70) 

Ax J -Ax/2 J -Ax/2 \ At J J 



where d is the dimensionality of the cell and /(cj) is the probability distribution function of the 
particle velocity, and thus /(^r) represents the probability that a particle originating from location 
x is found at x after a time interval, At. Note that the above expression holds when the components of 
the particle velocity in different directions are uncorrelated and that the particle spatial distribution 
inside the cell is initially uniform. Regarding (ii), if the single-particle auto-correlation function 
decays exponentially, the "renewal" probability 1 — a c for collisional de-correlation can, at least in 
principle, be inferred from simulations using Eq. (|67|). The net correlation coefficient, giving the 
probability of an "identical" sample, is a = a c a\. 

In our DSMC simulations, the single-particle correlation coefficient for the velocity was estimated 



to be a c ~ 0.94 for At = A/(15c m ) by fitting Eq. (67) to data; the same value for a c was found 
in both equilibrium and non-equilibrium simulations. In the comparison of figures m, m and m, 
however, we use a c = 1 since mass, momentum and energy are conserved during collisions. We find 
that this analysis produces very good results in the case of density and temperature (see figures ^ 
and |l^) and acceptable results for the case of mean velocity (see figure |l4|). Use of a c = 0.94 would 
tend to make the agreement better but no theoretical justification exists for it. The value of a; was 
directly calculated by assuming an equilibrium distribution. For our two-dimensional calculations 
with Ax/(c m Ai) = 2.5, we find a\ ~ 0.6. The effect of a mean flow is very small if the Mach number 
is small. This was verified through both direct evaluation of Eq. (|70| ) using a local equilibrium 
distribution function and DSMC simulations. 



5 Conclusions 

We have presented expressions for the statistical error in estimating the velocity, density, temperature 
and pressure in molecular simulations. These expressions were validated for flow of a dilute gas and 
dense liquid in a two-dimensional channel using the direct simulation Monte Carlo and Molecular 
Dynamics respectively. Despite the non-equilibrium nature of the validation experiments, good 
agreement is found between theory and simulation, verifying that modifications to non-equilibrium 
results are very small. In particular, in the dense fluid case, despite the significant non-equilibrium 
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due to a shear of the order of 5.5 x 10 s , the agreement with equilibrium theory is remarkable. 
We thus expect these results to hold for general non-equilibrium applications of interest. 

Predictions were also presented for the statistical error in estimating the shear stress and heat flux 
in dilute gases through cell averaging and surface averaging. Comparison with direct Monte Carlo 
simulations shows that the equilibrium assumption is justified. Within the same sets of assumptions, 
we were able to show that the distribution of particles leaving a cell is Poisson-distributed. 

One consideration that significantly limits the applicability of cell (volume) averaging for trans- 
port quantities is their neglect of transport due to collisions in the cell volume. In DSMC in particular, 
as the time step of the simulation is increased, the fraction of particles in a cell undergoing collision 
increases, and as a result, the error from using the above method increases. 

It was found that the fluctuation in state variables is significantly smaller compared to flux 
variables in the continuum regime Kn — > 0. This is important for the development of hybrid 
methods. Although a direct comparison was only presented between volume averaged quantities, we 
find that the fluxal measurements for the shear stress and heat flux perform similarly to the volume 
averaged counterparts (regarding scaling with the Knudsen number). 

Appendix: Number Fluctuations in Dilute Gases 

Consider an infinite ideal gas in equilibrium with mean number density, n. By definition, each 
infinitessimal volume element, dV, contains a particle with probability, ndV, and each such event in 
independent. From these assumptions, it is straightforward to show that the number of particles N 
in an arbitrary volume V is a Poisson random variable |12| , 



Note that this result does not strictly apply to a finite ideal gas because the probabilities of finding 
particles in different infinitessimal volumes are no longer independent, due to the global constraint 
of a fixed total number of particles. Nevertheless, it is an excellent approximation for most dilute 
gases, even finite non-ideal gases far from equilibrium (e.g. as demonstrated by simulations in the 
main text). 

Now consider a property A which each particle in a volume V may possess independently with 
probability, Pa- In this Appendix, we show that the distribution of the number of such particles, 
A^4, is also a Poisson random variable with mean and variance given by 



For example, in the main text we require the number of particles, Nij, which travel from one region, i, 
to another region, j, in a time interval, dt. The proof given here, however, is much more general and 
applies to arbitrary number fluctuations of an infinite ideal gas in equilibrium, such as the number 
of particles in a certain region, moving in a certain direction, of a certain "color" , with speeds above 
a certain theshold, etc. 

We begin by expressing Na as a random sum of random variables, 




with mean and variance given by 



(AT) 



(5N 2 ) = nV. 



(N A ) = (SNl) = P A (N) = P A nV. 



N 




(71) 
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where 

{1 if A occurs 
otherwise. 

is an indicator function for particle i to possess property A, which is a Bernoulli random variable 
with mean, Pa- It is convenient to introduce probability generating functions, 

oo 

f x (z) = £ Prob( X = m)z m = (1 - P A ) + P A z 

m=0 

and 

00 00 „-(N) /i\T\m m 

f N (z) = Y Prob(iV = m)z m = V S£LLL. = e W(*-l) (72) 

n n 
m=u m=U 

because the generating function for a random sum of random variables, as in Eq. (|7l]), is simply 
given by a composition of the generating functions for the summand and the number of terms jl^] , 



fN A (*) = E Prob(iV A = m)z m = f N (f x (z)) 



m=0 

Combining these expressions we have 

f NA ( z ) = e (NKl-P A +P A z-l) = e P A (N)( Z -l)_ 



Comparing with Eq. (72) completes the proof that Na is a Poisson random variable with mean, 
Pa(N). 
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Figure 1: Schematic particle motion crossing surface A. 
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Figure 2: Probability disrtibution of number of particles crossing a surface in the gas. The circles 
denote equilibrium DSMC results and the solid line shows a Poisson distribution with mean (N) 
based on the equilibrium result (J + ) = nc m /2y / 7r. 
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Figure 3: Probability disrtibution of number of particles crossing a surface in the gas in the presence 
of mean flow normal to the surface of interest. The circles denote equilibrium DSMC results and the 
solid line shows a Poisson distribution with mean (N) based on the particle flux in the presence of 
a mean flow. 
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Figure 4: Fractional error in velocity for Poiseuille flow in a channel as a function of the transverse 
channel coordinate, y. The dashed line denotes equation @ and the solid line denotes DSMC 
simulation results. 
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Figure 5: Fractional error in density for Poiseuille flow in a channel as a function of the transverse 
channel coordinate, y. The dashed line denotes equation ( |T0| ) and the solid line indicates DSMC 
simulation results. 
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Figure 6: Fractional error in temperature for Poiseuille flow in a channel as a function of the 
transverse channel coordinate, y. The dashed line denotes equation ( |T2| ) and the solid line indicates 
DSMC simulation results. 
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Figure 7: Fractional error in the shear stress r xy for Couette flow in a channel as a function of the 
transverse channel coordinate, y. The dashed line denotes equation (|26| ) and the solid line indicates 
DSMC simulation results. 
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Figure 8: Fractional error in the shear stress for Couette flow in a channel as a function of the 
transverse channel coordinate, y. The dashed line denotes equation ( |57| ) and the solid line indicates 
DSMC simulation results. 
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Figure 9: Fractional error in the heat flux for "Tempearture Couette" in a channel as a function 
of the transverse channel coordinate, y. The dashed line denotes equation (|27| ) and the solid line 
indicates DSMC simulation results. 
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Figure 10: Fractional error in the heat flux for "Temperature Couette" in a channel as a function 
of the transverse channel coordinate, y. The dashed line denotes equation ( ]59| ) and the solid line 
indicates DSMC simulation results. 
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Figure 11: Fractional error in velocity for dense- fluid Poiseuille flow in a channel as a function of the 
transverse channel coordinate, y. The dashed line denotes equation (^) and the solid line denotes 
MD simulation results. 



0.014 - 

Ex 

0.013 - 




y(m) 

Figure 12: Fractional error in temperature for dense- fluid Poiseuille flow in a channel as a function 
of the transverse channel coordinate, y. The dashed line denotes equation (|l2| ) and the solid line 
indicates MD simulation results. 
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Figure 13: Fractional error in density for dense-fluid Poiseuille flow in a channel as a function of the 
transverse channel coordinate, y. The dashed line denotes equation ( |l0| ) and the solid line indicates 
MD simulation results. 
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Figure 14: Fractional error in velocity for Poiseuille flow in a channel as a function of the transverse 
channel coordinate, y. The dashed line denotes the theoretical prediction including correlations and 
the solid line denotes DSMC simulation results. 
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Figure 15: Fractional error in density for Poiseuille flow in a channel as a function of the transverse 
channel coordinate, y. The dashed line denotes the theoretical prediction including correlations and 
the solid line indicates DSMC simulation results. 
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Figure 16: Fractional error in temperature for Poiseuille flow in a channel as a function of the trans- 
verse channel coordinate, y. The dashed line denotes the theoretical prediction including correlations 
and the solid line indicates DSMC simulation results. 
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